Update the default Gunicorn API server workers count to one #1454
Conversation
container = BentoMLContainer()
config = BentoMLConfiguration()
if "BENTOML_GUNICORN_TIMEOUT" in os.environ:
    config.override(["api_server", "port"], int(os.environ.get("BENTOML_GUNICORN_TIMEOUT")))
if "BENTOML_GUNICORN_NUM_OF_WORKERS" in os.environ:
    config.override(["api_server", "workers"], int(os.environ.get("BENTOML_GUNICORN_NUM_OF_WORKERS")))
container.config.from_dict(config.as_dict())

from bentoml import service, yatai

container.wire(packages=[service, yatai])
We don't have a good way right now; the deployment tests (e2e_tests) are currently running offline.
We need to refactor/improve the SageMaker deployment with better Gunicorn support anyway. I am fine with continuing to do the basic testing in the e2e tests.
Codecov Report
@@            Coverage Diff             @@
##           master    #1454      +/-   ##
==========================================
+ Coverage   67.54%   67.75%   +0.21%
==========================================
  Files         149      149
  Lines       10031    10008      -23
==========================================
+ Hits         6775     6781       +6
+ Misses       3256     3227      -29
Continue to review full report at Codecov.
container = BentoMLContainer()
config = BentoMLConfiguration()
if "BENTOML_GUNICORN_TIMEOUT" in os.environ:
    config.override(["api_server", "port"], int(os.environ.get("BENTOML_GUNICORN_TIMEOUT")))
is this a typo? s/port/timeout
Good catch. This was a typo. We should think about adding a unit test for this.
Fixed.
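For reference, a minimal sketch of the corrected override together with the kind of unit test discussed above. This is a hypothetical test, assuming BentoMLConfiguration is importable from bentoml.configuration.containers and that as_dict() exposes the overridden keys:

import os

from bentoml.configuration.containers import BentoMLConfiguration


def test_gunicorn_env_var_overrides(monkeypatch):
    # Hypothetical pytest sketch: set the env vars the serve entrypoint reads,
    # apply the fixed overrides (timeout, not port), and check the result.
    monkeypatch.setenv("BENTOML_GUNICORN_TIMEOUT", "120")
    monkeypatch.setenv("BENTOML_GUNICORN_NUM_OF_WORKERS", "2")

    config = BentoMLConfiguration()
    if "BENTOML_GUNICORN_TIMEOUT" in os.environ:
        config.override(
            ["api_server", "timeout"], int(os.environ["BENTOML_GUNICORN_TIMEOUT"])
        )
    if "BENTOML_GUNICORN_NUM_OF_WORKERS" in os.environ:
        config.override(
            ["api_server", "workers"],
            int(os.environ["BENTOML_GUNICORN_NUM_OF_WORKERS"]),
        )

    config_dict = config.as_dict()
    assert config_dict["api_server"]["timeout"] == 120
    assert config_dict["api_server"]["workers"] == 2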
@inject
def _serve(
    bento_server_timeout: int = Provide[BentoMLContainer.config.api_server.timeout],
    bento_server_workers: int = Provide[BentoMLContainer.api_server_workers],
We can also remove the CPU-core-related calculation in BentoMLContainer.api_server_workers, right?
Do we still want to maintain the behavior of automatically determining workers if workers is set to None?
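For context, a minimal sketch of that None-means-auto fallback. The helper name and the exact formula here are assumptions; the real logic lives in the BentoMLContainer.api_server_workers provider:

import multiprocessing


def api_server_workers(configured_workers=None):
    # Hypothetical sketch: honor an explicit worker count, otherwise fall back
    # to a CPU-count-based heuristic like the one this PR makes opt-in.
    if configured_workers:
        return configured_workers
    return multiprocessing.cpu_count() // 2 + 1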
Thanks, @ssheng, I'm merging this PR now, feel free to address the issue we just discussed in a follow-up PR.
…1454)
* Update the default API server Gunicorn workers to 1.
* Update dependency injection wire functionality for yatai/deployment/sagemaker/serve.
* Update wire modules in yatai/deployment/sagemaker/serve.
* Use app_server_workers in yatai/deployment/sagemaker/serve.
* Fix typo.
Description
Update the default API server Gunicorn workers count to one. Previously, the worker count was automatically determined from the number of available CPU cores.
Motivation and Context
The automatic determination was not ideal: users were often unaware of the large number of workers being created, and a large worker count could add significant memory overhead from loaded models. Users still have the ability to opt back into automatic worker-count determination through explicit configuration.
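For example, a deployment can still scale workers up through the environment variable shown in the diff above. A sketch, assuming the bentoml.configuration.containers import path:

import os

from bentoml.configuration.containers import BentoMLConfiguration, BentoMLContainer

# Sketch of the deployment-side flow: an operator sets the env var, and the
# serve entrypoint turns it into a config override before wiring the container.
os.environ["BENTOML_GUNICORN_NUM_OF_WORKERS"] = "4"

container = BentoMLContainer()
config = BentoMLConfiguration()
if "BENTOML_GUNICORN_NUM_OF_WORKERS" in os.environ:
    config.override(
        ["api_server", "workers"], int(os.environ["BENTOML_GUNICORN_NUM_OF_WORKERS"])
    )
container.config.from_dict(config.as_dict())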
How Has This Been Tested?
./dev/format.sh
./dev/lint.sh
./ci/unit_test.sh
Checklist:
* ./dev/format.sh and ./dev/lint.sh scripts have passed (instructions).